
    Efficient $\widetilde{O}(n/\epsilon)$ Spectral Sketches for the Laplacian and its Pseudoinverse

    In this paper we consider the problem of efficiently computing $\epsilon$-sketches for the Laplacian and its pseudoinverse. Given a Laplacian and an error tolerance $\epsilon$, we seek to construct a function $f$ such that for any vector $x$ (chosen obliviously from $f$), with high probability $(1-\epsilon)\, x^\top A x \leq f(x) \leq (1 + \epsilon)\, x^\top A x$, where $A$ is either the Laplacian or its pseudoinverse. Our goal is to construct such a sketch $f$ efficiently and to store it in the least space possible. We provide nearly-linear time algorithms that, when given a Laplacian matrix $\mathcal{L} \in \mathbb{R}^{n \times n}$ and an error tolerance $\epsilon$, produce $\widetilde{O}(n/\epsilon)$-size sketches of both $\mathcal{L}$ and its pseudoinverse. Our algorithms improve upon the previous best sketch size of $\widetilde{O}(n / \epsilon^{1.6})$ for sketching the Laplacian form by Andoni et al. (2015) and $O(n / \epsilon^2)$ for sketching the Laplacian pseudoinverse by Batson, Spielman, and Srivastava (2008). Furthermore, we show how to compute all-pairs effective resistances from an $\widetilde{O}(n/\epsilon)$-size sketch in $\widetilde{O}(n^2/\epsilon)$ time. This improves upon the previous best running time of $\widetilde{O}(n^2/\epsilon^2)$ by Spielman and Srivastava (2008). Comment: Accepted to SODA 2018; v2 fixes a small bug in the proof of Lemma 3. This does not affect the correctness of any of our results.
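    To make the quadratic-form guarantee concrete, here is a hedged illustration in Python/NumPy of the classic effective-resistance sampling of Spielman and Srivastava, which yields a roughly $O(n \log n / \epsilon^2)$-edge sparsifier rather than the paper's $\widetilde{O}(n/\epsilon)$-size sketch; the random graph, edge weights, and sample count below are arbitrary choices for illustration only.

        # Illustrative only: classic effective-resistance sampling, not the paper's sketch.
        import numpy as np

        def laplacian(n, edges, weights):
            L = np.zeros((n, n))
            for (u, v), w in zip(edges, weights):
                L[u, u] += w; L[v, v] += w
                L[u, v] -= w; L[v, u] -= w
            return L

        rng = np.random.default_rng(0)
        n = 40
        edges = [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < 0.3]
        weights = rng.uniform(0.5, 2.0, size=len(edges))
        L = laplacian(n, edges, weights)
        Lpinv = np.linalg.pinv(L)

        # Effective resistance of edge (u, v) is (e_u - e_v)^T L^+ (e_u - e_v).
        reff = np.array([Lpinv[u, u] + Lpinv[v, v] - 2 * Lpinv[u, v] for u, v in edges])
        probs = weights * reff
        probs /= probs.sum()

        # Sample q edges with replacement and reweight by 1 / (q * p_e).
        eps = 0.5
        q = int(4 * n * np.log(n) / eps**2)
        new_w = np.zeros(len(edges))
        for i in rng.choice(len(edges), size=q, p=probs):
            new_w[i] += weights[i] / (q * probs[i])
        L_sparse = laplacian(n, edges, new_w)

        # The sparsifier's quadratic form should be within (1 +/- eps) of the original.
        x = rng.standard_normal(n)
        print(x @ L @ x, x @ L_sparse @ x)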

    Exploiting Numerical Sparsity for Efficient Learning: Faster Eigenvector Computation and Regression

    In this paper, we obtain improved running times for regression and top eigenvector computation for numerically sparse matrices. Given a data matrix $A \in \mathbb{R}^{n \times d}$ where every row $a \in \mathbb{R}^d$ has $\|a\|_2^2 \leq L$ and numerical sparsity at most $s$, i.e. $\|a\|_1^2 / \|a\|_2^2 \leq s$, we provide faster algorithms for these problems in many parameter settings. For top eigenvector computation, we obtain a running time of $\tilde{O}(nd + r(s + \sqrt{r s}) / \mathrm{gap}^2)$ where $\mathrm{gap} > 0$ is the relative gap between the top two eigenvectors of $A^\top A$ and $r$ is the stable rank of $A$. This running time improves upon the previous best unaccelerated running time of $O(nd + r d / \mathrm{gap}^2)$, as it is always the case that $r \leq d$ and $s \leq d$. For regression, we obtain a running time of $\tilde{O}(nd + (nL / \mu) \sqrt{s nL / \mu})$ where $\mu > 0$ is the smallest eigenvalue of $A^\top A$. This running time improves upon the previous best unaccelerated running time of $\tilde{O}(nd + n L d / \mu)$. This result expands the regimes where regression can be solved in nearly linear time from when $L/\mu = \tilde{O}(1)$ to when $L / \mu = \tilde{O}(d^{2/3} / (sn)^{1/3})$. Furthermore, we obtain similar improvements even when row norms and numerical sparsities are non-uniform, and we show how to achieve even faster running times by accelerating using approximate proximal point [Frostig et al. 2015] / catalyst [Lin et al. 2015]. Our running times depend only on the size of the input and natural numerical measures of the matrix, i.e. eigenvalues and $\ell_p$ norms, making progress on a key open problem regarding optimal running times for efficient large-scale learning. Comment: To appear in NIPS 2018.
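    The parameters in these bounds are easy to compute directly; the following minimal NumPy sketch evaluates the row-norm bound $L$, the per-row numerical sparsity $s = \|a\|_1^2/\|a\|_2^2$, and the stable rank $r = \|A\|_F^2/\|A\|_2^2$ on a synthetic matrix (an arbitrary stand-in, not data from the paper).

        # Hedged sketch of the quantities the running times above depend on.
        import numpy as np

        rng = np.random.default_rng(0)
        A = rng.standard_normal((1000, 50)) * (rng.random((1000, 50)) < 0.1)  # sparse-ish rows

        row_norm_sq = (A ** 2).sum(axis=1)                                    # ||a||_2^2 per row
        num_sparsity = np.abs(A).sum(axis=1) ** 2 / np.maximum(row_norm_sq, 1e-12)  # ||a||_1^2 / ||a||_2^2
        L = row_norm_sq.max()                                                 # uniform row-norm bound
        stable_rank = (A ** 2).sum() / np.linalg.norm(A, 2) ** 2              # ||A||_F^2 / ||A||_2^2

        print(f"L = {L:.2f}, max numerical sparsity s = {num_sparsity.max():.1f}, "
              f"stable rank r = {stable_rank:.1f}, d = {A.shape[1]}")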

    Coordinate Methods for Accelerating $\ell_\infty$ Regression and Faster Approximate Maximum Flow

    We provide faster algorithms for approximately solving $\ell_\infty$ regression, a fundamental problem prevalent in both combinatorial and continuous optimization. In particular, we provide accelerated coordinate descent methods capable of provably exploiting dynamic measures of coordinate smoothness, and apply them to $\ell_\infty$ regression over a box to give algorithms which converge in $k$ iterations at an $O(1/k)$ rate. Our algorithms can be viewed as an alternative approach to the recent breakthrough result of Sherman [She17], which achieves a similar runtime improvement over classic algorithmic approaches, i.e. smoothing and gradient descent, which either converge at an $O(1/\sqrt{k})$ rate or have running times with a worse dependence on problem parameters. Our runtimes match those of [She17] across a broad range of parameters and achieve improvement in certain structured cases. We demonstrate the efficacy of our result by providing faster algorithms for the well-studied maximum flow problem. Directly leveraging our accelerated $\ell_\infty$ regression algorithms implies an $\tilde{O}(m + \sqrt{mn}/\epsilon)$ runtime to compute an $\epsilon$-approximate maximum flow for an undirected graph with $m$ edges and $n$ vertices, generically improving upon the previous best known runtime of $\tilde{O}(m/\epsilon)$ in [She17] whenever the graph is slightly dense. We further design an algorithm adapted to the structure of the regression problem induced by maximum flow, obtaining a runtime of $\tilde{O}(m + \max(n, \sqrt{ns})/\epsilon)$, where $s$ is the squared $\ell_2$ norm of the congestion of any optimal flow. Moreover, we show how to leverage this result to achieve improved exact algorithms for maximum flow on a variety of unit capacity graphs. We hope that our work serves as an important step towards achieving even faster maximum flow algorithms. Comment: A preliminary version appeared in FOCS 2018, with an error in the accelerated coordinate descent proof. Originally we claimed $m + \sqrt{ns}/\epsilon$ for our approximate maximum flow runtime; this version obtains $m + (n + \sqrt{ns})/\epsilon$. The $\ell_\infty$ regression results have been substantially improved, with dependence $c$ on column sparsity (formerly $c^{2.5}$).
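    For contrast with the coordinate methods above, the classic "smoothing and gradient descent" baseline mentioned in the abstract can be sketched in a few lines: replace $\|Ax - b\|_\infty$ by a log-sum-exp surrogate and run plain gradient descent. This is a hedged toy illustration (synthetic data, arbitrary smoothing parameter and step size), not the paper's accelerated method.

        # Illustrative baseline only: smoothed l_inf regression via log-sum-exp + gradient descent.
        import numpy as np

        def smooth_linf(r, t):
            # log-sum-exp over +/- residuals approximates max_i |r_i| within t * log(2m)
            z = np.concatenate([r, -r]) / t
            m = z.max()
            return t * (m + np.log(np.exp(z - m).sum()))

        rng = np.random.default_rng(0)
        A = rng.standard_normal((200, 20))
        b = rng.standard_normal(200)
        t = 0.05                                  # smoothing parameter (accuracy ~ t * log(2m))
        x = np.zeros(20)
        step = t / np.linalg.norm(A, 2) ** 2      # surrogate's gradient is (||A||^2 / t)-Lipschitz
        for _ in range(5000):
            z = np.concatenate([A @ x - b, b - A @ x]) / t
            w = np.exp(z - z.max())
            w /= w.sum()
            x -= step * (A.T @ (w[:200] - w[200:]))   # gradient of the smoothed objective
        print(smooth_linf(A @ x - b, t), np.abs(A @ x - b).max())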

    Path Finding I: Solving Linear Programs with $\tilde{O}(\sqrt{\mathrm{rank}})$ Linear System Solves

    In this paper we present a new algorithm for solving linear programs that requires only $\tilde{O}(\sqrt{\mathrm{rank}(A)}\,L)$ iterations to solve a linear program with $m$ constraints, $n$ variables, constraint matrix $A$, and bit complexity $L$. Each iteration of our method consists of solving $\tilde{O}(1)$ linear systems and additional nearly linear time computation. Our method improves upon the previous best iteration bound by a factor of $\tilde{\Omega}((m/\mathrm{rank}(A))^{1/4})$ for methods with polynomial time computable iterations and by $\tilde{\Omega}((m/\mathrm{rank}(A))^{1/2})$ for methods which solve at most $\tilde{O}(1)$ linear systems in each iteration. Our method is parallelizable and amenable to linear algebraic techniques for accelerating the linear system solver. As such, up to polylogarithmic factors we either match or improve upon the best previous running times in both depth and work for different ratios of $m$ and $\mathrm{rank}(A)$. Moreover, our method matches up to polylogarithmic factors a theoretical limit established by Nesterov and Nemirovski in 1994 regarding the use of a "universal barrier" for interior point methods, thereby resolving a long-standing open question regarding the running time of polynomial time interior point methods for linear programming.

    Efficient Accelerated Coordinate Descent Methods and Faster Algorithms for Solving Linear Systems

    In this paper we show how to accelerate randomized coordinate descent methods and achieve faster convergence rates without paying per-iteration costs in asymptotic running time. In particular, we show how to generalize and efficiently implement a method proposed by Nesterov, giving faster asymptotic running times for various algorithms that use standard coordinate descent as a black box. In addition to providing a proof of convergence for this new general method, we show that it is numerically stable, efficiently implementable, and in certain regimes, asymptotically optimal. To highlight the computational power of this algorithm, we show how it can be used to create faster linear system solvers in several regimes:
    - We show how this method achieves a faster asymptotic runtime than conjugate gradient for solving a broad class of symmetric positive definite systems of equations.
    - We improve the best known asymptotic convergence guarantees for Kaczmarz methods, a popular technique for image reconstruction and solving overdetermined systems of equations, by accelerating a randomized algorithm of Strohmer and Vershynin.
    - We achieve the best known running time for solving Symmetric Diagonally Dominant (SDD) systems of equations in the unit-cost RAM model, obtaining an $O(m \log^{3/2} n \,(\log \log n)^{1/2} \log(\log n / \epsilon))$ asymptotic running time by accelerating a recent solver by Kelner et al.
    Beyond the independent interest of these solvers, we believe they highlight the versatility of the approach of this paper, and we hope that they will open the door for further algorithmic improvements in the future.
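    As a point of reference for the second bullet, the unaccelerated randomized Kaczmarz method of Strohmer and Vershynin is only a few lines: sample a row with probability proportional to its squared norm and project the iterate onto that row's hyperplane. The sketch below is a hedged illustration on a synthetic consistent system, not the accelerated solver developed in the paper.

        # Illustrative only: plain (unaccelerated) randomized Kaczmarz on a consistent system.
        import numpy as np

        rng = np.random.default_rng(0)
        A = rng.standard_normal((500, 50))
        x_true = rng.standard_normal(50)
        b = A @ x_true                                  # consistent overdetermined system

        row_norms_sq = (A ** 2).sum(axis=1)
        probs = row_norms_sq / row_norms_sq.sum()       # sample rows prop. to ||a_i||_2^2
        x = np.zeros(50)
        for i in rng.choice(len(b), size=20000, p=probs):
            x += (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]   # project onto row i's hyperplane
        print(np.linalg.norm(x - x_true))               # converges linearly in expectation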

    Efficient Inverse Maintenance and Faster Algorithms for Linear Programming

    In this paper, we consider the following inverse maintenance problem: given $A \in \mathbb{R}^{n\times d}$ and a number of rounds $r$, we receive an $n\times n$ diagonal matrix $D^{(k)}$ at round $k$ and we wish to maintain an efficient linear system solver for $A^{T}D^{(k)}A$ under the assumption that $D^{(k)}$ does not change too rapidly. This inverse maintenance problem is the computational bottleneck in solving multiple optimization problems. We show how to solve this problem with $\tilde{O}(\mathrm{nnz}(A)+d^{\omega})$ preprocessing time and amortized $\tilde{O}(\mathrm{nnz}(A)+d^{2})$ time per round, improving upon previous running times for solving this problem. Consequently, we obtain the fastest known running times for solving multiple problems including linear programming and computing a rounding of a polytope. In particular, given a feasible point in a linear program with $d$ variables, $n$ constraints, and constraint matrix $A\in\mathbb{R}^{n\times d}$, we show how to solve the linear program in time $\tilde{O}((\mathrm{nnz}(A)+d^{2})\sqrt{d}\log(\epsilon^{-1}))$. We achieve our results through a novel combination of classic numerical techniques of low rank update, preconditioning, and fast matrix multiplication as well as recent work on subspace embeddings and spectral sparsification that we hope will be of independent interest. Comment: In an older version of this paper, we mistakenly claimed an improved running time for the Dikin walk by noting solely the improved running time for linear system solving and ignoring the determinant computation.
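    The "low rank update" ingredient can be illustrated directly: if only $k$ of the diagonal entries of $D$ change between rounds, then $A^{T}DA$ changes by a rank-$k$ term and its inverse can be updated with the Sherman-Morrison-Woodbury formula instead of being recomputed. The NumPy sketch below is a hedged illustration of that identity with arbitrary dimensions; it is not the paper's full data structure, which combines such updates with preconditioning and subspace embeddings.

        # Illustrative only: Woodbury update of (A^T D A)^{-1} after k diagonal entries of D change.
        import numpy as np

        rng = np.random.default_rng(0)
        n, d, k = 2000, 50, 3
        A = rng.standard_normal((n, d))
        D = rng.uniform(1.0, 2.0, size=n)
        M_inv = np.linalg.inv(A.T @ (D[:, None] * A))

        # Perturb k entries of D: A^T D' A = A^T D A + U C U^T with U = A[idx].T, C = diag(delta).
        idx = rng.choice(n, size=k, replace=False)
        delta = rng.uniform(0.2, 0.8, size=k)
        U = A[idx].T                                   # d x k
        C = np.diag(delta)                             # k x k
        # Woodbury: (M + U C U^T)^{-1} = M^{-1} - M^{-1} U (C^{-1} + U^T M^{-1} U)^{-1} U^T M^{-1}
        MiU = M_inv @ U
        M_inv_new = M_inv - MiU @ np.linalg.inv(np.linalg.inv(C) + U.T @ MiU) @ MiU.T

        D_new = D.copy(); D_new[idx] += delta
        print(np.abs(M_inv_new @ (A.T @ (D_new[:, None] * A)) - np.eye(d)).max())  # ~ 1e-12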

    Efficient Profile Maximum Likelihood for Universal Symmetric Property Estimation

    Estimating symmetric properties of a distribution, e.g. support size, coverage, entropy, and distance to uniformity, is among the most fundamental problems in algorithmic statistics. While each of these properties has been studied extensively and separate optimal estimators are known for each, in striking recent work, Acharya et al. 2016 showed that there is a single estimator that is competitive for all symmetric properties. This work proved that computing the distribution that approximately maximizes \emph{profile likelihood (PML)}, i.e. the probability of the observed frequency of frequencies, and returning the value of the property on this distribution is sample competitive with respect to a broad class of estimators of symmetric properties. Further, they showed that even computing an approximation of the PML suffices to achieve such a universal plug-in estimator. Unfortunately, prior to this work there was no known polynomial time algorithm to compute an approximate PML, and it was open to obtain a polynomial time universal plug-in estimator through the use of approximate PML. In this paper we provide an algorithm (in number of samples) that, given $n$ samples from a distribution, computes an approximate PML distribution up to a multiplicative error of $\exp(n^{2/3} \mathrm{poly}\log(n))$ in time nearly linear in $n$. Generalizing work of Acharya et al. 2016 on the utility of approximate PML, we show that our algorithm provides a nearly linear time universal plug-in estimator for all symmetric functions up to accuracy $\epsilon = \Omega(n^{-0.166})$. Further, we show how to extend our work to provide efficient polynomial-time algorithms for computing a $d$-dimensional generalization of PML (for constant $d$) that allows for universal plug-in estimation of symmetric relationships between distributions. Comment: 68 pages.
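    The central object here, the profile of a sample, is simply its multiset of "frequencies of frequencies" and is easy to compute; the toy sketch below (made-up distribution and sample size) shows the profile a PML-style estimator would try to explain before plugging the maximizing distribution into a property such as entropy.

        # Illustrative only: computing the profile (frequency of frequencies) of a sample.
        import numpy as np
        from collections import Counter

        rng = np.random.default_rng(0)
        p = np.array([0.4, 0.3, 0.1, 0.1, 0.05, 0.05])        # toy distribution
        sample = rng.choice(len(p), size=50, p=p)

        counts = Counter(sample.tolist())                      # frequency of each observed symbol
        profile = Counter(counts.values())                     # frequency of frequencies
        print(dict(profile))
        # A PML estimator would pick the distribution q (approximately) maximizing the
        # probability of this profile, then report e.g. the entropy of q as the estimate.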

    Stability of the Lanczos Method for Matrix Function Approximation

    The ubiquitous Lanczos method can approximate $f(A)x$ for any symmetric $n \times n$ matrix $A$, vector $x$, and function $f$. In exact arithmetic, the method's error after $k$ iterations is bounded by the error of the best degree-$k$ polynomial uniformly approximating $f(x)$ on the range $[\lambda_{\min}(A), \lambda_{\max}(A)]$. However, despite decades of work, it has been unclear if this powerful guarantee holds in finite precision. We resolve this problem, proving that when $\max_{x \in [\lambda_{\min}, \lambda_{\max}]}|f(x)| \le C$, Lanczos essentially matches the exact arithmetic guarantee if computations use roughly $\log(nC\|A\|)$ bits of precision. Our proof extends work of Druskin and Knizhnerman [DK91], leveraging the stability of the classic Chebyshev recurrence to bound the stability of any polynomial approximating $f(x)$. We also study the special case of $f(A) = A^{-1}$, where stronger guarantees hold. In exact arithmetic Lanczos performs as well as the best polynomial approximating $1/x$ at each of $A$'s eigenvalues, rather than on the full eigenvalue range. In seminal work, Greenbaum gives an approach to extending this bound to finite precision: she proves that finite precision Lanczos and the related CG method match any polynomial approximating $1/x$ in a tiny range around each eigenvalue [Gre89]. For $A^{-1}$, this bound appears stronger than ours. However, we exhibit matrices with condition number $\kappa$ where exact arithmetic Lanczos converges in $\mathrm{polylog}(\kappa)$ iterations, but Greenbaum's bound predicts $\Omega(\kappa^{1/5})$ iterations. It thus cannot offer significant improvement over the $O(\kappa^{1/2})$ bound achievable via our result. Our analysis raises the question of whether convergence in less than $\mathrm{poly}(\kappa)$ iterations can be expected in finite precision, even for matrices with clustered, skewed, or otherwise favorable eigenvalue distributions.
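    For readers who have not seen it, the exact-arithmetic method being analyzed is short: run $k$ steps of the Lanczos recurrence to build an orthonormal basis $Q_k$ and tridiagonal $T_k$, then return $\|x\|\, Q_k f(T_k) e_1$. The sketch below is a hedged textbook implementation run in floating point (no reorthogonalization; the matrix, the function, and $k$ are arbitrary), not the finite-precision analysis of the paper.

        # Textbook Lanczos approximation of f(A) x; illustration only.
        import numpy as np

        def lanczos_f_times_x(A, x, f, k):
            n = len(x)
            Q = np.zeros((n, k))
            alpha, beta = np.zeros(k), np.zeros(k)
            Q[:, 0] = x / np.linalg.norm(x)
            q_prev, b_prev = np.zeros(n), 0.0
            for j in range(k):
                w = A @ Q[:, j] - b_prev * q_prev          # three-term recurrence
                alpha[j] = Q[:, j] @ w
                w -= alpha[j] * Q[:, j]
                b_prev, q_prev = np.linalg.norm(w), Q[:, j]
                if j + 1 < k:
                    if b_prev < 1e-12:                     # invariant subspace found early
                        k = j + 1
                        break
                    beta[j] = b_prev
                    Q[:, j + 1] = w / b_prev
            T = np.diag(alpha[:k]) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
            evals, evecs = np.linalg.eigh(T)
            fT_e1 = evecs @ (f(evals) * evecs[0])          # f(T) e_1 via eigendecomposition of T
            return np.linalg.norm(x) * Q[:, :k] @ fT_e1

        rng = np.random.default_rng(0)
        B = rng.standard_normal((200, 200))
        A = B @ B.T / 200 + np.eye(200)                    # symmetric positive definite test matrix
        x = rng.standard_normal(200)
        approx = lanczos_f_times_x(A, x, np.sqrt, k=30)
        evals, evecs = np.linalg.eigh(A)
        exact = evecs @ (np.sqrt(evals) * (evecs.T @ x))
        print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))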

    Memory-Sample Tradeoffs for Linear Regression with Small Error

    We consider the problem of performing linear regression over a stream of $d$-dimensional examples, and show that any algorithm that uses a subquadratic amount of memory exhibits a slower rate of convergence than can be achieved without memory constraints. Specifically, consider a sequence of labeled examples $(a_1,b_1), (a_2,b_2), \ldots$, with $a_i$ drawn independently from a $d$-dimensional isotropic Gaussian, and where $b_i = \langle a_i, x\rangle + \eta_i$ for a fixed $x \in \mathbb{R}^d$ with $\|x\|_2 = 1$ and with independent noise $\eta_i$ drawn uniformly from the interval $[-2^{-d/5},2^{-d/5}]$. We show that any algorithm with at most $d^2/4$ bits of memory requires at least $\Omega(d \log \log \frac{1}{\epsilon})$ samples to approximate $x$ to $\ell_2$ error $\epsilon$ with probability of success at least $2/3$, for $\epsilon$ sufficiently small as a function of $d$. In contrast, for such $\epsilon$, $x$ can be recovered to error $\epsilon$ with probability $1-o(1)$ with memory $O(d^2 \log(1/\epsilon))$ using $d$ examples. This represents the first nontrivial lower bound for regression with super-linear memory, and may open the door for strong memory/sample tradeoffs for continuous optimization. Comment: 22 pages, to appear in STOC'19.
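    A toy simulation of the model makes the two regimes easy to picture: stream $(a_i, b_i)$ pairs with the stated Gaussian design and tiny uniform noise, and note that an unconstrained-memory learner can already recover $x$ from just $d$ examples by solving one linear system. The parameters below are arbitrary and the snippet only instantiates the data model, not the lower-bound argument.

        # Illustrative only: instantiating the data model from the abstract.
        import numpy as np

        rng = np.random.default_rng(0)
        d = 25
        x = rng.standard_normal(d)
        x /= np.linalg.norm(x)                                        # ||x||_2 = 1
        a = rng.standard_normal((d, d))                               # d isotropic Gaussian examples
        eta = rng.uniform(-2.0 ** (-d / 5), 2.0 ** (-d / 5), size=d)  # noise in [-2^{-d/5}, 2^{-d/5}]
        b = a @ x + eta

        x_hat = np.linalg.solve(a, b)                                 # memory-rich baseline: one solve
        print(np.linalg.norm(x_hat - x))                              # error set by noise and conditioning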

    Parallel Reachability in Almost Linear Work and Square Root Depth

    In this paper we provide a parallel algorithm that, given any $n$-node $m$-edge directed graph and source vertex $s$, computes all vertices reachable from $s$ with $\tilde{O}(m)$ work and $n^{1/2 + o(1)}$ depth with high probability in $n$. This algorithm also computes a set of $\tilde{O}(n)$ edges which, when added to the graph, preserves reachability and ensures that the diameter of the resulting graph is at most $n^{1/2 + o(1)}$. Our result improves upon the previous best known almost linear work reachability algorithm due to Fineman, which had depth $\tilde{O}(n^{2/3})$. Further, we show how to leverage this algorithm to achieve improved distributed algorithms for single source reachability in the CONGEST model. In particular, we provide a distributed algorithm that, given an $n$-node digraph of undirected hop-diameter $D$, solves the single source reachability problem in $\tilde{O}(n^{1/2} + n^{1/3 + o(1)} D^{2/3})$ rounds of communication in the CONGEST model with high probability in $n$. Our algorithm is nearly optimal whenever $D = O(n^{1/4 - \epsilon})$ for any constant $\epsilon > 0$ and is the first nearly optimal algorithm for general graphs whose diameter is $\Omega(n^\delta)$ for any constant $\delta$. Comment: 38 pages. v2 fixes a small typo in Section 4 found by Aaron Bernstein. v3 fixes some overflow issues. v4 fixes the proof of Lemma 5.1. We thank Aaron Bernstein for pointing this out.